Rule Based Lexical Analysis of Maltese
Abstract
Since no computer-based dictionaries exist for Maltese, the only analysis that can be made at present is rule based. The paper describes a rule-based system that takes into account the mixed origins of Maltese (Semitic and Romance) to categorise function words and verb words. The system separates the database, the rule formalisms and the rule definitions, enabling easier analysis and quicker changes to rules when necessary.

INTRODUCTION

Maltese is written in Roman script using 30 letters, made up of six vowels and twenty-four consonants. Of particular interest are two graphemes, għ and ie, which are written as two letters but are each considered a single grapheme in the language. A recent survey, based on the best-known Maltese dictionary [1], found that the origin of the words is 40% Semitic, 40% Romance (Italian) and 20% English. The linguistic processing therefore has to take this mixed structure into account.

The purpose of this work was to analyse words in running text in order to distinguish function words and verb words, within a text-to-speech synthesis application. This is necessary to obtain pause information at appropriate word boundaries. Therefore only a partial lexical analysis is considered.

The method used here involves a rule-based system with separation between the rule definitions, the rule operations and the input data formalism. Two databases hold the function words and the verb words respectively, for comparison with the word under test. The rule formalisms are kept in small local tables pertaining to the various rules under test. In this way, addition or deletion of data is completely independent of the rule formalism, and rules can be added or amended independently of each other.

MALTESE LINGUISTICS

As in other Semitic languages, Maltese words of Semitic origin do not have a stem to which affixes are attached, but rather use transfixes. The stem, or root, is made up of a number of consonants, which can never occur in isolation and whose order cannot be altered. Transfixes are then added to the root, sometimes together with prefixes and suffixes. Transfixes are made up of a number of vowels and may include operations on consonants, such as doubling the middle consonant (gemination). On the other hand, words of Romance origin follow the usual pattern of a stem plus inflectional and derivational affixes to form other lexemes. For example, the Semitic derivations from k,t,b are kiteb …
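The separation of data, rule tables and rule operations described above, together with the consonantal-root view of Semitic words, can be illustrated with a minimal Python sketch. It is not the system from the paper: the database contents, the rule table layout and all function names (`graphemes`, `skeleton`, `collapse_geminates`, `classify`) are assumptions made for illustration, and treating every "ie" sequence as one grapheme is a simplification.

```python
# Illustrative sketch only; names and data below are hypothetical.

# Two-letter graphemes that count as a single unit.
DIGRAPHS = ("għ", "ie")
# The six vowel graphemes.
VOWELS = {"a", "e", "i", "o", "u", "ie"}

# Databases (toy contents), kept apart from the rules.
FUNCTION_WORDS = {"u", "li", "fuq", "minn"}
VERB_ROOTS = {("k", "t", "b")}


def graphemes(word):
    """Split a word into graphemes, keeping għ and ie together."""
    w = word.lower()
    out, i = [], 0
    while i < len(w):
        if w[i:i + 2] in DIGRAPHS:
            out.append(w[i:i + 2])
            i += 2
        else:
            out.append(w[i])
            i += 1
    return out


def skeleton(word):
    """Return the consonant graphemes of a word, in order."""
    return tuple(g for g in graphemes(word) if g not in VOWELS)


def collapse_geminates(consonants):
    """Merge adjacent doubled consonants (gemination) into one."""
    out = []
    for c in consonants:
        if not out or out[-1] != c:
            out.append(c)
    return tuple(out)


# Rule table: (category, test) pairs tried in order. Adding or removing
# a rule does not touch the databases, and vice versa.
RULES = [
    ("function", lambda w: w.lower() in FUNCTION_WORDS),
    ("verb", lambda w: collapse_geminates(skeleton(w)) in VERB_ROOTS),
]


def classify(word):
    """Apply the rule table and return the first matching category."""
    for category, test in RULES:
        if test(word):
            return category
    return "other"


if __name__ == "__main__":
    for w in ["kiteb", "u", "dar"]:
        print(w, classify(w))
```

A root match alone over-generates, since nouns built on the same root would also match; a fuller system would need pattern (transfix) rules as well, which this sketch leaves out.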
Similar resources
Restrictive Relative Clauses in Maltese
This paper provides a descriptive overview of restrictive relative clauses (henceforth RRCs) in Maltese, a construction which has received little attention to date and which is poorly described in existing grammars. We outline an LFG approach to the facts we describe building on existing analyses, and notably on Asudeh 2004/to appear, as far as the treatment of resumption in RRCs is concerned. F...
LEXIE - an Experiment in Lexical Information Extraction
This document investigates the possibility of extracting lexical information automatically from the pages of a printed dictionary of Maltese. An experiment was carried out on a small sample of dictionary entries using hand-crafted rules to parse the entries. Although the results obtained were quite promising, a major problem turned out to be errors introduced by OCR and the inconsistent style adop...
Establishing the concurrent validity of a vocabulary checklist for young Maltese children.
OBJECTIVE The current literature highlights the research and clinical applications of parental report in investigating the status of language skills in young children. Since language acquisition norms for Maltese have not yet been established, this study attempts to obtain preliminary indications of developmental trends in early lexical development by adapting an established parent-completed vo...
Crowd-sourcing evaluation of automatically acquired, morphologically related word groupings
The automatic discovery and clustering of morphologically related words is an important problem with several practical applications. This paper describes the evaluation of word clusters carried out through crowd-sourcing techniques for the Maltese language. The hybrid (Semitic-Romance) nature of Maltese morphology, together with the fact that no large-scale lexical resources are available for M...
ODL: an Object Description Language for Lexical Information
This paper describes ODL, a description language for lexical information that is being developed within the context of a national project called MLRS (Maltese Language Resource Server) whose goal is to create a national corpus and computational lexicon for the Maltese language. The main aim of ODL is to make the task of the lexicographer easier by allowing lexical specifications to be set out f...
Publication date: 1998